A GUIDE TO THE EVALUATION OF PILOT PROGRAMS



This document is designed for the beginner in practical evaluation who
needs a point of departure, especially in dealing with pilot programs or
projects. For many reasons, evaluation has suddenly achieved a new
prominence. Yet the current state of the art of evaluation leaves many
educators somewhat bewildered as to its application. Numerous evaluation
models, such as the Tylerian model, CIPP from Ohio State University, EPIC
from Arizona, Stake's composite-goal model, and others are presently being
advocated across the country, each worthy of consideration in its own
right. This guide makes no attempt to adhere to any particular model but
instead draws from several of them in an attempt to provide a frame of
reference for a workable approach. This deliberate avoidance of any
particular model is based upon the premise that too much confidence in any
one model causes practitioners to become complacent in seeking newer and
better approaches.

WHAT IS EVALUATION?

The purpose of an educational pilot program is to determine whether an
activity will lead to increased knowledge and skills for those exposed to
the program. If those exposed demonstrate significant gains as a result of
the program, the activity is expected to be incorporated into the overall
curriculum on a permanent basis. A pilot program thus enables us to test a
particular activity without making a full-scale commitment.

Evaluation is used to ascertain the effectiveness of a pilot program.
Evaluation should be regarded as a professional tool which is employed as
a means of encouraging program modification and revision as well as either
the abandonment of the program altogether or the adoption of the activity as
a permanent part of the educational enterprise. Evaluation in its simplest
sense can be considered as the collection of data for the purpose of making
decisions.

WHY EVALUATE?

An evaluation plan should not be included in a project proposal solely
because it is a legal requirement or mandated by the guidelines.
Decision-makers in the school must really believe that evaluation is worth
the time and effort required. Yet we cannot assume that a new and promising
approach confirmed on the basis of a good evaluation plan will automatically
survive and become a permanent part of the educational enterprise in a
particular school system. Careful consideration must be given, prior to any
venture into a pilot project, to the past history of the change process in a
particular school setting relative to the following questions:

1. What are the mechanisms for change which exist within the system?

2. How prone to change is the system? 

3. How will the local decision-makers react to positive results? 

Some provision must be made for the adoption or expansion of a
worthwhile activity well in advance, probably at the time a decision is
made to submit a project proposal. Otherwise, the whole purpose of a pilot
program is defeated before the venture begins.

WHEN DO YOU BEGIN TO EVALUATE?

The development of an evaluation plan for any pilot program begins when
you first begin to think about the program, or even before. You may be
asking yourself how the latter is possible. Sometimes, in the execution of
a feasibility study or survey to identify and support unmet needs, some very
pertinent data will be gathered. Some of these baseline or benchmark data
can be very helpful in determining the direction of the change process
during the actual implementation of the program, as well as in determining
the extent to which specified objectives have been reached. These data are
also very helpful in revealing any conditions which may have existed prior
to the program activity and which are related to the outcomes.

Thus, evaluation must be an integral part of a pilot program. Evaluation
needs to be designed into the program. It should not come as an afterthought
to an already existing program, but must become a basic part of the program
itself.

CAN AN EVALUATION BE OBJECTIVE?

One of the primary factors which has contributed to the slow growth of
formal evaluation as a professional tool is the sensitivity educators have
to criticism. Although the "politics" of evaluation is a very interesting
subject, it will be treated only briefly here. Evaluation should and must
be approached objectively. An evaluator should not set out to prove any
particular point of view. The evaluation process should be executed sincerely
with no preconceived notions of what the end results should be or must be.

Too often educators have so much professional pride and prejudice invested
in a program that they are very reluctant to accept any objective evidence
which does not support their own convictions. The state of the art of formal
evaluation will only begin to improve at an increased rate when we are able
to reduce this type of resistance to objectivity to a bare minimum.

WHO SHOULD EVALUATE?

A decision as to whether the pilot program will be evaluated internally,
externally, or through a combination of both approaches is influenced by
many factors. For this purpose, internal evaluation is considered to be
that which is conducted by the personnel directly responsible for executing
the program, whereas external evaluation is that which is performed by an
independent contractor, such as a university or some other non-profit agency.
In any case it is the project director's job to determine who can supply the
best information on the effectiveness of the program based upon the
availability of fiscal resources, the experience of the personnel, the
general complexity of the pilot program, and so forth. A further brief
analysis of external evaluation may serve to clarify the ramifications of
this decision.

The general "rule of thumb" in estimating the cost of an evaluation by
an external agency is a range of 2 to 5 percent of the total cost of the
project, although the cost is not limited to that range. Some of the
pertinent factors which will affect the cost are enumerated below:

1. The general complexity of the evaluation plan. 

2. The number of measuring instruments the contractor must develop. 

3. The source and experience of the personnel. 

4. The agency which has the major responsibility for data collection. 

5. The amount of travel involved by the evaluators. 

6. The type of final report desired. 
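
The rule of thumb above amounts to simple arithmetic. The following is a
minimal sketch in Python, offered only as an illustration. The function name
and its parameters are assumptions made for this example; the default
percentages are the 2 to 5 percent range quoted above.

```python
def external_evaluation_cost_range(total_project_cost, low_pct=2, high_pct=5):
    """Return the (low, high) planning range for an external evaluation.

    The default percentages come from the rule of thumb in the text;
    the function itself is a hypothetical helper, not a prescribed method.
    """
    if total_project_cost < 0:
        raise ValueError("total_project_cost must not be negative")
    if low_pct > high_pct:
        raise ValueError("low_pct must not exceed high_pct")
    low = total_project_cost * low_pct / 100
    high = total_project_cost * high_pct / 100
    return low, high


# Hypothetical project costing 100,000 dollars in total.
low, high = external_evaluation_cost_range(100_000)
print(f"Plan for roughly {low:,.0f} to {high:,.0f} dollars")
```

The result is only a starting point for negotiation; the factors listed
above may move the actual figure outside this range.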

In arriving at an agreement with an independent agency, the program 
administrator should request a letter from the contractor which includes 
the following: 

1. An enumeration of what the evaluator will do. 

2. An enumeration of what the pilot program personnel will do. 

3. The contractor's charge for the development of any measuring
instruments.

4. Specifications of the caliber and experience of the personnel
involved.

5. A schedule of payment over a definite period of time, such as:

a. when contract is negotiated

b. three months later

c. six months later

d. after delivery of the final report



HOW CAN THE SCOPE AND DIRECTION OF THE EVALUATION BE DETERMINED?

In the development of an evaluation plan there are several early
decisions the evaluator must make which will determine to a large extent the
direction of the evaluation. Many of these decisions will be judgmental in
nature, based upon the general complexity of the pilot program. Some of the
questions which will have to be answered before decisions can be made
include the following:

1. Is this evaluation to be undertaken within a single program 
or as a comparison between two or more programs? 

2. Will this evaluation measure outcomes alone? 

3. Or will the evaluation also attempt to consider the conditions
existing prior to the program which may relate to the outcomes as
well as the many encounters that occur as the program progresses?

4. What are the variables which will be selected for evaluation as 
indicated by the objectives of the program? 

5. Can the variables referred to in questions 3 and 4 above be 
stated as measurable objectives? 

6. What are the costs involved in implementing each of the alterna- 
tives of the decision? 

7. Do the project guidelines encourage local directors to include 
line items in the budget for evaluation? 

8. For what audiences will the final evaluation report be prepared?

SOME VARIABLES TO CONSIDER IN A PILOT PROGRAM

I. COMMUNITY CHARACTERISTICS

 1. Population                       6. Literacy
 2. Location                         7. Ethnic Groups
 3. Occupations                      8. Dwellings
 4. Unemployment                     9. Delinquency
 5. Welfare                         10. Family Income


II. SCHOOL CHARACTERISTICS

 1. Per Capita Expenditures         11. Teacher Qualifications
 2. Teachers' Salaries              12. Teaching Experience
 3. Grade Levels                    13. Average Age of Teachers
 4. Condition of Facilities         14. Male-Female Ratio of Teachers
 5. Teacher-Pupil Ratio             15. Teacher Turnover
 6. Pupil Enrollment                16. Average Daily Attendance
 7. Grouping Practices              17. Ethnic Groups
 8. Curriculum                      18. Male-Female Ratio, Pupils
 9. Services Available              19. Efforts to Gain Public Acceptance
10. Achievement Level               20. Problems in Gaining Public
                                        Acceptance

III. THE PROGRAM

 1. Additional Personnel            19. Experimental Class
 2. Regular Staff                   20. Control Class
 3. Qualifications                  21. Major Segments of Program
 4. Experience                      22. Time Devoted Each Activity
 5. Duties                          23. Pupils Involved
 6. Time Commitment                 24. Instructional Materials
 7. In-service Training             25. Teacher Activity
 8. Male-Female Ratio               26. Aide or Adult-Pupil Ratio
 9. Provisions of Services          27. Teaching Methods
10. Length of Program               28. Motivation of Pupils
11. Hours of Instruction            29. Equipment and Materials Used
12. Intervals Between Testing       30. Instructional Materials Development
13. Teacher Meetings                31. Parent and/or Community Development
14. Purposes of Meetings            32. Total Cost
15. Location of Classes             33. Broad Categories of Expense
16. Physical Arrangements           34. Normal Per-Pupil Cost
17. Grouping of Teachers            35. Per-Pupil Cost of Program
18. Grouping of Students

IV. EVIDENCE OF CHANGE

 1. Changes of Achievement Measures 11. Parallel Forms for Pre and
 2. Changes of Method of Instruction    Posttesting
 3. Changes in Pupils' Attitudes    12. Conditions When Measures Acquired
 4. Changes in Teachers' Attitudes  13. Qualifications of Assessors or
 5. Selection of Control Group          Observers
 6. Effect of Other Programs        14. Special Training of Testers
 7. Attrition of Participants       15. Quantify Data
 8. Pretest and Posttest Samples    16. Graphically Display Data
 9. Characteristics of Experimental 17. Basis for Comparison
    Sample                          18. Appropriateness of Statistical Tests
10. Selection of Measuring          19. Communication of Statistical
    Instruments                         Conclusions
                                    20. Educational Importance of Conclusions

HOW CAN MEASURABLE OBJECTIVES BE DEVELOPED?



One of the most important steps in the evaluation process is translating
broad, vague goals into more specific objectives or outcomes. The process of
determining the goals of a pilot program is primarily a rational and
judgmental matter. These judgments may be made by various groups and arrived
at by various means (e.g., conference and committee discussion). However, the
initial goals which emerge do not usually provide a suitable basis for an
evaluation plan, since they are usually expressed in very general terms.

Goals expressed in general terms frequently are vague, convey different 
meanings to different people and, thus, are far removed from the practical 
operation of appraising. The difficult task is that of translating or 
expanding the global goals into measurable objectives which can serve as 
a useful framework for the appraisal. 

No attempt will be made herein to give full treatment to the translation 
of general goals into measurable objectives. Nevertheless, here are some 
pertinent points to remember in the translation or expansion process: 

1. Much professional judgment is involved. 

2. Two different groups of professionals will not necessarily 
construct the same measurable objectives for the same general 
goal. 

3. It is not necessary to have a one-to-one ratio or correspondence 
between general goals and measurable objectives. More often than 
not you will have several measurable objectives for each general 
goal. 

4. The translation or expansion procedure is actually a thinking 
process fortified with a few structural applications. 

5. Each measurable objective will probably include the following four
component parts:

a. The target (e.g., student teacher, administrator).

b. The type of behavior (e.g., cognitive, affective,
psychomotor).

c. The content area (e.g., subject, mode of instruction,
materials).

d. The method of measurement (e.g., standardized test,
rating scale, checklist).
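
The four component parts listed above can be treated as a simple record, so
that each draft objective can be checked for completeness before it enters
the evaluation plan. The following is a minimal sketch in Python. The class
name, field names, helper function, and sample values are all assumptions
made for illustration; only the four components themselves come from the
text.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MeasurableObjective:
    """One measurable objective with the four component parts from the text."""
    target: str        # who is to change, e.g. "student teacher"
    behavior: str      # type of behavior, e.g. "cognitive"
    content_area: str  # subject, mode of instruction, or materials
    measurement: str   # e.g. "standardized test", "rating scale"

    def statement(self) -> str:
        """Assemble a one-line statement of the objective."""
        return (f"{self.target}: {self.behavior} behavior in "
                f"{self.content_area}, measured by {self.measurement}")


def missing_components(objective: MeasurableObjective) -> list:
    """Return the names of any component parts left blank."""
    return [name for name, value in asdict(objective).items()
            if not value.strip()]


# Hypothetical objective for a teacher preparation program.
draft = MeasurableObjective(
    target="student teacher",
    behavior="cognitive",
    content_area="lesson planning",
    measurement="checklist",
)
print(draft.statement())
print(missing_components(draft))
```

Several such records may be written for a single general goal, in keeping
with point 3 above.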

HOW CAN EVALUATION ACTIVITIES BE SCHEDULED?

Let's assume at this point that the evaluation plan is designed to
collect data relative to input variables, program activity variables, and
outcome variables, and that the technology exists for conducting the
evaluation or that the time and talent are available to develop the
technology. The next thing to consider is the pacing of the evaluation
activity. The work should be reasonably well distributed so that it will not
be concentrated in such a way that the staff is overburdened and the quality
of the work suffers. A practical way of avoiding a congested schedule
consists of the following three steps. The first step entails the
enumeration of all of the input, program activity, and outcome variables
which will be considered during the evaluation process. The following
outline is a brief illustration of this procedure as it applies to a
teacher preparation program:

A. Input Variables

1. Publicity 

a. Scope 

b. Effectiveness of Communication 

c. Timing 

2. Participants 

a. Biographical and Professional Data

b. Formal Courses 

c. Informal Activities

The second step consists of scheduling the data gathering activity. The
following partially filled table illustrates when the data identified in
the first step will be gathered:



DATE          INPUT VARIABLES        PROGRAM VARIABLES    OUTCOME VARIABLES
              A.1.a-c    A.2.a-c

May '69       A1
June '69                 A2
July '69
August '69


In the third step the evaluation activities are identified by letters and a 
brief explanation given about the collection of the data. The following is 
an example of this step which is consistent with the activity alluded to in 
the first two steps: 

A. The scope, timing, and effectiveness of publicity will be
assessed by means of questionnaires or telephone surveys
of a random sample of teachers in the field. In addition,
participants will be questioned about where, how, and when they
heard of the program, whether publicity was too late or
appropriately timed, etc.
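
The three steps can also be kept as a small data structure, which makes it
easy to see whether too much data gathering has been assigned to any one
month. The following is a minimal sketch in Python. The function name, the
monthly limit, and the sample entries beyond the publicity and participant
items named in the outline are assumptions made for illustration only.

```python
from collections import Counter

# Each entry pairs a month with the code of a data gathering activity.
# The codes follow the outline style used in the first step; the exact
# entries here are hypothetical.
schedule = [
    ("May '69", "A.1 Publicity"),
    ("June '69", "A.2 Participants"),
    ("June '69", "Program variables (hypothetical)"),
    ("June '69", "Outcome variables (hypothetical)"),
]


def overloaded_months(entries, max_per_month):
    """Return the months holding more activities than the staff can carry."""
    counts = Counter(month for month, _activity in entries)
    return sorted(month for month, n in counts.items() if n > max_per_month)


# With an assumed limit of two activities per month, June is congested.
print(overloaded_months(schedule, max_per_month=2))
```

Any month reported by the check is a candidate for moving an activity to
an earlier or later date, which is the purpose of the pacing step.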

Please be reminded at this point that the measuring instruments used by
evaluators will include, but not be limited to, inventory schedules,
biographical data sheets, interview routines, checklists, opinionnaires, and
various kinds of psychometric tests.

HOW WILL THE DATA BE COLLECTED?



Since evaluation is the collection of data for decision making, careful
consideration must be given to the handling of these data. There are at
least two bases for judging the characteristics of a program relative to the
data gathered:

1. With respect to absolute standards as reflected by professional
judgments, and

2. With respect to relative standards as reflected by characteristics
of alternate programs.

The following table is designed to guide the treatment of the data gathered:

EVALUATION DESIGNS

DATA WITHIN PROJECT                  DATA OUTSIDE PROJECT

1. Project by Comparison             1. Project Group Compared
   with Absolute Standard               with "National" Norm

2. Pretest-Posttest                  2. Change in Project Group
   Comparison                           Compared with Change in
                                        Group During Previous
                                        Period

3. Project Group Compared            3. Change in Project Group
   with a Projection                    Compared to Change in
                                        Concurrent Control Group

HOW CAN PRE AND POSTTESTING BE FACILITATED?



The Solomon Four-Group Design lends itself well to pre and posttesting 
in short term projects. The design is as follows: 



                          PRETEST     PROGRAM     POSTTEST

1. Experimental Group     Test 1      Program     Test 2
2. Control Group          Test 1                  Test 2
3. Experimental Group                 Program     Test 2
4. Control Group                                  Test 2



In this design the experimental group participates in the pilot program and
is compared to a group which does not participate. A special effort is made
to control for any effect the pretest itself may have on the
pretest-posttest difference. There are four groups involved as illustrated
in the table, two experimental and two control groups. The posttest is
administered to all four groups; however, only one experimental group and
one control group are pretested. Obviously, a crucial element here is also
the assignment of participants to the groups.
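
Because all four groups take the posttest, the posttest means alone give a
first look at both the program and the pretest. The following is a minimal
sketch in Python of that comparison. It is a descriptive summary only, not a
significance test; the function name, the dictionary keys, and the sample
scores are assumptions made for illustration.

```python
from statistics import mean


def solomon_summary(g1_post, g2_post, g3_post, g4_post):
    """Summarize posttest means for the four groups of the design.

    g1: pretested, received program    g2: pretested, no program
    g3: not pretested, received program  g4: not pretested, no program
    """
    m1, m2, m3, m4 = (mean(g) for g in (g1_post, g2_post, g3_post, g4_post))
    effect_pretested = m1 - m2      # program effect among pretested groups
    effect_unpretested = m3 - m4    # program effect free of any pretest
    return {
        "program_effect_pretested": effect_pretested,
        "program_effect_unpretested": effect_unpretested,
        # Nonzero value suggests the pretest changed how the program worked.
        "pretest_by_program_interaction": effect_pretested - effect_unpretested,
        # Difference between pretested and unpretested groups overall.
        "pretest_main_effect": (m1 + m2) / 2 - (m3 + m4) / 2,
    }


# Hypothetical posttest scores for four small groups.
summary = solomon_summary([80, 84], [70, 74], [78, 82], [69, 71])
for name, value in summary.items():
    print(name, value)
```

If the two program effects are close to each other, the pretest has
probably not distorted the results; a large gap is a warning that the
pretest itself contributed to the observed gain.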
HOW CAN THE MEASUREMENT OF PARTICIPANTS BE FACILITATED?



If an evaluator wants to examine the effects of a pilot program, it is
not necessary for every student in the program to take every item in every
test related to the outcomes. Obtaining data on every participant on every
item can be a waste of time and effort. To make decisions about programs
one needs data on the program, not the individual. If a group is exposed to
a program and a set of test items is developed to assess the effects of that
program, then the information desired is how well the total group did on the
entire set of items. The process known as matrix sampling is a practical
approach which facilitates the achievement of this end.

The fundamental idea of matrix sampling is this: every program
participant (from a universe of participants) need not respond to every item
(from a set of test items) in order to obtain estimates of the mean and
variance (or standard deviation) of the participants' responses to the set
of test items. Matrix sampling involves the simultaneous and random sampling
of both participants and test items. The most efficient use of this
technique involves different, non-overlapping samples of participants taking
non-overlapping samples of test items. That is, one sample of participants
takes one sample of test items, a second sample of participants takes a
second sample of items, and so forth.

Perhaps the best way to describe how to use matrix sampling is to
illustrate its use in a hypothetical situation. An evaluator desires to
find out how well a particular class is doing. There are 500 students and he
has constructed a 50-item test on the curriculum. He randomly divides the
50-item test into five subtests of ten items each. He then randomly assigns
the 500 students to five groups of 100 and administers the subtests to the
groups as depicted in the following:



STUDENT SAMPLE          SUBTESTS (10 items each)
(100 each group)
                        A       B       C       D       E

I                       X                               X
II                      X       X
III                             X       X
IV                                      X       X
V                                               X       X


For each of the groups an arithmetic mean and a variance estimate are computed.
The arithmetic mean of the five group means is the best estimate of the total
class's average performance. The arithmetic mean of the five group variance
estimates is the best estimate of the total class's variance. It is important
in this process to have every test item responded to as often as every other
item; thus, if necessary, data should be deleted to satisfy this objective.
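
The procedure described above can be sketched in a few lines of Python. The
sketch assumes the simplest non-overlapping arrangement described in the
text, in which each group of students takes exactly one subtest, and it
scores each student as the proportion of his items answered correctly so
that all groups are on the same scale. The function names and the
`is_correct` callback are assumptions made for illustration, and the
averaging of group variances follows the simplified recipe given above
rather than a formal matrix-sampling estimator.

```python
import random
from statistics import mean, variance


def matrix_sample(student_ids, item_ids, n_parts, rng):
    """Randomly split students and items into n_parts matched samples.

    Returns a list of (students, items) pairs with no overlap between pairs.
    """
    students = list(student_ids)
    items = list(item_ids)
    rng.shuffle(students)
    rng.shuffle(items)
    # Taking every n-th element of a shuffled list gives a random partition.
    return [(students[k::n_parts], items[k::n_parts]) for k in range(n_parts)]


def estimate_class_mean_and_variance(assignments, is_correct):
    """Average the group means and group variances, as the text describes.

    is_correct(student, item) must return 1 for a right answer, else 0.
    """
    group_means, group_variances = [], []
    for students, items in assignments:
        scores = [sum(is_correct(s, i) for i in items) / len(items)
                  for s in students]
        group_means.append(mean(scores))
        group_variances.append(variance(scores))
    return mean(group_means), mean(group_variances)


# 500 students, 50 items, five groups of 100 taking ten items each.
rng = random.Random(1969)  # fixed seed so the split can be reproduced
assignments = matrix_sample(range(500), range(50), 5, rng)
```

In use, `is_correct` would look up each student's recorded response; each
student answers only ten items, yet the whole 50-item test is covered.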

There is an alternative procedure which can be used, and there are
instances in which it may be more desirable. Consider an experimental group
of 500 subjects. Instead of using matrix sampling to estimate the mean and
variance of the 500 subjects on the 50 items, just the mean of each item is
estimated. This results in 50 numbers, each number an estimate of the mean
of an item for the 500 subjects. The same procedure can be followed for
500 subjects in a control group. Thus, 50 pairs of numbers are obtained,
two estimated item means for each of the 50 items. A t-test for matched
pairs can be performed on these data to examine the hypothesis of no
difference between the groups. Since the item data can be useful for
diagnostic purposes in the examination of the pilot program, this approach
will be valuable for many evaluation situations.
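
The matched-pairs t statistic mentioned above is computed from the
differences between the paired item means. The following is a minimal sketch
in Python using only the standard library; the function name and the sample
values are assumptions made for illustration. It returns the t statistic and
its degrees of freedom, which must then be compared with a tabled critical
value (49 degrees of freedom in the case of 50 items).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(experimental_means, control_means):
    """Return (t, degrees_of_freedom) for matched pairs of item means."""
    if len(experimental_means) != len(control_means):
        raise ValueError("each item needs one mean from each group")
    differences = [e - c for e, c in zip(experimental_means, control_means)]
    n = len(differences)
    if n < 2:
        raise ValueError("at least two pairs are required")
    spread = stdev(differences)  # sample standard deviation of differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(differences) / (spread / sqrt(n))
    return t, n - 1


# Hypothetical item means (proportion correct) for four items only.
experimental = [0.60, 0.70, 0.80, 0.90]
control = [0.50, 0.50, 0.70, 0.60]
t, df = paired_t_statistic(experimental, control)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The list of differences is itself useful for diagnosis, since items with
unusually small or negative differences point to parts of the program that
may need attention.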
IN RETROSPECT:



As was indicated in the very first paragraph, the suggestions found
herein are offered as a frame of reference in the development and execution
of an evaluation plan. The primary emphasis of this guide is to assist in
the development of an evaluation design. No attempt was made to inform the
reader in regard to the development of instruments or the analysis and
interpretation of the results. The inclusion of the numerous references in
the Selected Bibliography should serve well for further in-depth study of
any aspect of evaluation. Also, a deliberate effort was made to avoid
advocating the use of any of the current nationwide evaluation models.

Evaluation requires a systematic procedure for marshaling and presenting
objective evidence to support any judgment about the effectiveness of
a program. Its ultimate justification rests in determining how much is
accomplished relative to the program objective, the changes which occurred,
and whether or not the changes were the ones which were intended.
Nevertheless, it should be reiterated that the potential evaluator should
also be alerted to the importance of assessing the process or effort
associated with the program. Although the evaluation of the processes has
been compared to "the measurement of the number of times a bird flaps his
wings without making any attempt to determine how far the bird has flown,"
an analysis of the process can reveal why a program is not working as well
as expected. Locating the cause of the failure can result in modifying the
program while it is in progress so it will work. This procedure is a
pertinent part of the evaluation plan. It requires a breakdown and
assessment of the component parts of the program and the identification of
those aspects which contribute to or detract from the overall effect of the
program.